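Supervised learning tasks such as cancer survival prediction from gigapixel whole slide images (WSIs) are a critical challenge in computational pathology that requires modeling complex features of the tumor microenvironment. These learning tasks are often solved with deep multiple instance learning (MIL) models that do not explicitly capture intratumoral heterogeneity. We develop a novel variance pooling architecture that enables a MIL model to incorporate intratumoral heterogeneity into its predictions. Two interpretability tools based on representative patches are illustrated to probe the biological signals captured by these models. An empirical study with 4,479 gigapixel WSIs from the Cancer Genome Atlas shows that adding variance pooling onto MIL frameworks improves survival prediction performance for five cancer types.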
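As a rough illustration of the variance pooling idea described above, the sketch below pools a bag of patch features into an attention-weighted mean plus the attention-weighted variance of the features projected onto a few directions. It is a minimal NumPy sketch under stated assumptions, not the authors' architecture: the function name `mean_and_variance_pool`, the projection matrix `V`, and the softmax attention are hypothetical choices, and in a trained model both `V` and the attention logits would be learned.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def mean_and_variance_pool(H, attn_logits, V):
    """Pool a bag of instance features into one vector.

    H           : (n, d) array, one row per patch (hypothetical features)
    attn_logits : (n,) unnormalised attention scores (assumed given)
    V           : (d, k) projection directions (assumed given)
    Returns a (d + k,) vector: weighted mean, then weighted variances.
    """
    a = softmax(attn_logits)              # attention weights, sum to 1
    mean_feat = a @ H                     # (d,) weighted mean of the patches
    proj = H @ V                          # (n, k) patches projected onto V
    proj_mean = a @ proj                  # (k,) weighted mean per direction
    var_feat = a @ (proj - proj_mean) ** 2  # (k,) weighted variance per direction
    return np.concatenate([mean_feat, var_feat])
```

With this formulation, a bag whose patches are all identical gets zero variance features, while a bag with differing patches gets positive ones, which is the kind of heterogeneity signal a mean-only pooling layer discards.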
Translated by Google Translate
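Frozen sectioning (FS) is the preparation method used for microscopic evaluation of tissue during surgical operations. The high speed of the procedure allows pathologists to rapidly assess key microscopic features, such as tumor margins and malignant status, to guide surgical decision-making and minimize disruption to the course of the operation. However, FS is prone to introducing many misleading artificial structures (histological artifacts), such as nuclear ice crystals, compression, and cutting artifacts, which hinder the pathologist's timely and accurate diagnostic judgment. Additional training and prolonged experience are often required to make highly effective, time-critical diagnoses on frozen sections. On the other hand, the gold-standard tissue preparation technique of formalin fixation and paraffin embedding (FFPE) provides significantly superior image quality, but it is a very time-consuming process (12-48 hours), making it unsuitable for intraoperative use. In this paper, we propose an artificial intelligence (AI) method that improves FS image quality by computationally transforming frozen-sectioned whole-slide images (FS-WSIs) into FFPE-style whole-slide images within minutes. AI-FFPE corrects FS artifacts under the guidance of an attention mechanism, while using a self-regularization mechanism established between the FS input image and the synthesized FFPE-style image to preserve relevant features. As a result, the AI-FFPE method successfully generates FFPE-style images without significantly extending tissue processing time, and consequently improves diagnostic accuracy. We evaluate the method using a variety of qualitative and quantitative metrics, including visual Turing tests with 20 board-certified pathologists.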
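Common data models solve many challenges of standardizing electronic health record (EHR) data, but they are unable to semantically integrate the resources needed for deep phenotyping. Open Biological and Biomedical Ontology (OBO) Foundry ontologies provide semantically computable representations of biological knowledge and enable the integration of a variety of biomedical data. However, mapping EHR data to OBO Foundry ontologies requires significant manual curation and domain expertise. We introduce a framework for mapping Observational Medical Outcomes Partnership (OMOP) standard vocabularies to OBO Foundry ontologies. Using this framework, we produced mappings for 92,367 conditions, 8,615 drug ingredients, and 10,673 measurement results. Domain experts verified the mapping accuracy, and when examined across 24 hospitals, the mappings covered 99% of conditions and drug ingredients and 68% of measurements. Finally, we demonstrate that OMOP2OBO mappings can aid in the systematic identification of undiagnosed rare disease patients who might benefit from genetic testing.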
Traditionally, data analysis and theory have been viewed as separate disciplines, each feeding into fundamentally different types of models. Modern deep learning technology is beginning to unify these two disciplines and will produce a new class of predictively powerful space weather models that combine the physical insights gained from data and theory. We call on NASA to invest in the research and infrastructure necessary for the heliophysics community to take advantage of these advances.
Three main points: 1. Data Science (DS) will be increasingly important to heliophysics; 2. Methods of heliophysics science discovery will continually evolve, requiring the use of learning technologies [e.g., machine learning (ML)] that are applied rigorously and that are capable of supporting discovery; and 3. To grow with the pace of data, technology, and workforce changes, heliophysics requires a new approach to the representation of knowledge.
In the Earth's magnetosphere, there are fewer than a dozen dedicated probes beyond low-Earth orbit making in-situ observations at any given time. As a result, we poorly understand its global structure and evolution, the mechanisms of its main activity processes, magnetic storms, and substorms. New Artificial Intelligence (AI) methods, including machine learning, data mining, and data assimilation, as well as new AI-enabled missions will need to be developed to meet this Sparse Data challenge.
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
Recent work shows that the expressive power of Graph Neural Networks (GNNs) in distinguishing non-isomorphic graphs is exactly the same as that of the Weisfeiler-Lehman (WL) graph test. In particular, it shows that the WL test can be simulated by GNNs. However, those simulations involve neural networks for the 'combine' function of size polynomial or even exponential in the number of graph nodes $n$, as well as feature vectors of length linear in $n$. We present an improved simulation of the WL test on GNNs with \emph{exponentially} lower complexity. In particular, the neural network implementing the combine function in each node has only a polylogarithmic number of parameters in $n$, and the feature vectors exchanged by the nodes of the GNN consist of only $O(\log n)$ bits. We also give logarithmic lower bounds for the feature vector length and the size of the neural networks, showing the (near)-optimality of our construction.
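For readers unfamiliar with the WL graph test that the abstract above refers to, the following is a minimal pure-Python sketch of the classical 1-dimensional WL color refinement. It is not the paper's GNN construction; the function names `refine` and `wl_distinguishes` and the adjacency-dict graph format are assumptions made for illustration. The two graphs are refined jointly so that they share one color palette, which keeps their color histograms comparable.

```python
from collections import Counter


def refine(adj, rounds):
    """Run 1-WL color refinement on a graph given as {node: [neighbours]}."""
    colors = {v: 0 for v in adj}  # every node starts with the same color
    for _ in range(rounds):
        # A node's signature is its own color plus the sorted multiset
        # of its neighbours' colors.
        sigs = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        # Compress each distinct signature to a small integer color.
        palette = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        colors = {v: palette[sigs[v]] for v in adj}
    return colors


def wl_distinguishes(adj_a, adj_b):
    """Return True if 1-WL assigns the two graphs different color histograms."""
    # Refine the disjoint union so both graphs use the same palette.
    union = {("a", v): [("a", u) for u in nbrs] for v, nbrs in adj_a.items()}
    union.update(
        {("b", v): [("b", u) for u in nbrs] for v, nbrs in adj_b.items()}
    )
    colors = refine(union, rounds=len(union))
    hist_a = Counter(c for (side, _), c in colors.items() if side == "a")
    hist_b = Counter(c for (side, _), c in colors.items() if side == "b")
    return hist_a != hist_b
```

A three-node path and a triangle are told apart, whereas a six-cycle and two disjoint triangles are not, since every node in both graphs has degree two. Note that each compressed color here is an integer smaller than the number of nodes, so it fits in a logarithmic number of bits; how a small neural network can perform that compression is the subject of the paper, not of this sketch.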
Language models (LMs) now excel at many tasks such as few-shot learning, question answering, reasoning, and dialog. However, they sometimes generate unsupported or misleading content. A user cannot easily determine whether their outputs are trustworthy or not, because most LMs do not have any built-in mechanism for attribution to external evidence. To enable attribution while still preserving all the powerful advantages of recent generation models, we propose RARR (Retrofit Attribution using Research and Revision), a system that 1) automatically finds attribution for the output of any text generation model and 2) post-edits the output to fix unsupported content while preserving the original output as much as possible. When applied to the output of several state-of-the-art LMs on a diverse set of generation tasks, we find that RARR significantly improves attribution while otherwise preserving the original input to a much greater degree than previously explored edit models. Furthermore, the implementation of RARR requires only a handful of training examples, a large language model, and standard web search.
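In this paper, we propose a novel grasp pipeline based on contact point detection on a truncated signed distance function (TSDF) volume to achieve closed-loop 7-degree-of-freedom (7-DoF) grasping in cluttered environments. The key aspects of our method are that 1) the proposed pipeline, comprising multi-view fusion, contact-point sampling and evaluation, and collision checking, provides reliable and collision-free 7-DoF gripper poses with real-time performance; and 2) the contact-based pose representation effectively eliminates the ambiguity introduced by normal-based methods, providing a more precise and flexible solution. Extensive simulated and real-robot experiments demonstrate that the proposed pipeline selects more antipodal and stable grasp poses and outperforms normal-based baselines in terms of grasp success rate in both simulated and physical scenarios.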
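The multi-view fusion step mentioned above is commonly implemented as a per-voxel weighted running average of truncated signed distances. The sketch below shows that standard update in NumPy; it is an assumption about how such fusion is typically done, not the authors' code, and the function name `fuse_tsdf` and its parameters are hypothetical.

```python
import numpy as np


def fuse_tsdf(tsdf, weight, sdf_obs, trunc, obs_weight=1.0):
    """Fuse one view's signed distances into a running TSDF.

    tsdf, weight : current per-voxel TSDF values (in [-1, 1]) and weights
    sdf_obs      : signed distance of each voxel to the surface seen in
                   this view (positive in front of the surface)
    trunc        : truncation distance, in the same units as sdf_obs
    """
    # Voxels far behind the observed surface carry no information
    # from this view and are left unchanged.
    valid = sdf_obs >= -trunc
    d = np.clip(sdf_obs / trunc, -1.0, 1.0)  # truncate and normalise
    new_weight = weight + obs_weight * valid
    averaged = (tsdf * weight + d * obs_weight) / np.maximum(new_weight, 1e-12)
    fused = np.where(valid, averaged, tsdf)
    return fused, new_weight
```

Calling `fuse_tsdf` once per depth view accumulates evidence in the volume; the zero crossing of the fused values then approximates the object surface on which contact points could be sampled.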